Optimality and Scalability in Lattice Histogram Construction
Abstract
The Lattice Histogram is a recently proposed data summarization technique that achieves approximation quality superior to that of an optimal plain histogram. Like other hierarchical synopsis methods, a lattice histogram (LH) approximates data using a hierarchical structure. However, this structure is not defined a priori; it is an unknown of the problem, not a given. Past work has defined the properties that an LH must obey and developed general-purpose approximation algorithms for its construction. Two major issues remain unaddressed. First, constructing an optimal LH for a given error metric is, to date, an unsolved problem. Second, the proposed algorithms have space and time complexities too high for practical use in real-world settings. In this paper, we address both issues, focusing on the case in which the target error metric is a maximum-error metric. Our algorithms treat both the error-bounded LH construction problem, in which the space occupied by an LH is minimized under an error constraint, and the classic space-bounded problem. First, we develop a dynamic-programming scheme that finds an optimal LH under a given maximum-error bound. Second, we propose an efficient, practical greedy algorithm that solves the same problem with much lower time and space requirements. Then, we show how both algorithms can be applied to the classic space-bounded problem, which minimizes error under a bound on space. Our experimental study with real-world data sets shows the effectiveness of our methods compared to competing summarization techniques. Moreover, our findings show that our greedy heuristic performs almost as well as the optimal solution in terms of accuracy.
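The relationship between the error-bounded and space-bounded problems can be illustrated on a much simpler structure. The sketch below is not the LH algorithm from the paper, whose details the abstract does not give; it uses a plain (non-hierarchical) histogram as a stand-in and assumes the maximum absolute error metric. All function names are hypothetical. An error-bounded routine minimizes the number of buckets for a given error bound, and the space-bounded problem is then solved by searching for the smallest error bound whose solution fits within the bucket budget.

```python
from typing import List, Tuple

# (start index, end index inclusive, representative value); illustrative type only
Bucket = Tuple[int, int, float]


def error_bounded_buckets(data: List[float], eps: float) -> List[Bucket]:
    """Fewest buckets of a PLAIN histogram with max absolute error <= eps.

    Stand-in for an error-bounded solver: a left-to-right scan extends the
    current bucket while half of its value range stays within eps.
    """
    if not data:
        raise ValueError("data must be non-empty")
    buckets: List[Bucket] = []
    start, lo, hi = 0, data[0], data[0]
    for i in range(1, len(data)):
        new_lo, new_hi = min(lo, data[i]), max(hi, data[i])
        if (new_hi - new_lo) / 2 > eps:
            # Adding data[i] would break the bound: close the bucket here.
            buckets.append((start, i - 1, (lo + hi) / 2))
            start, lo, hi = i, data[i], data[i]
        else:
            lo, hi = new_lo, new_hi
    buckets.append((start, len(data) - 1, (lo + hi) / 2))
    return buckets


def max_abs_error(data: List[float], buckets: List[Bucket]) -> float:
    """Maximum absolute deviation of any value from its bucket representative."""
    return max(abs(data[i] - v) for s, e, v in buckets for i in range(s, e + 1))


def space_bounded_buckets(
    data: List[float], max_buckets: int
) -> Tuple[float, List[Bucket]]:
    """Smallest max error achievable with at most max_buckets buckets.

    The optimal error is half the value range of some bucket, so it is one of
    the pairwise half-differences. Binary search over those candidates works
    because the bucket count never increases as the error bound grows.
    """
    if max_buckets < 1:
        raise ValueError("max_buckets must be at least 1")
    candidates = sorted({abs(a - b) / 2 for a in data for b in data})
    lo_i, hi_i = 0, len(candidates) - 1
    while lo_i < hi_i:
        mid = (lo_i + hi_i) // 2
        if len(error_bounded_buckets(data, candidates[mid])) <= max_buckets:
            hi_i = mid
        else:
            lo_i = mid + 1
    eps = candidates[lo_i]
    return eps, error_bounded_buckets(data, eps)


data = [1, 2, 10, 11, 20, 21, 22]
eps, buckets = space_bounded_buckets(data, 3)
print(eps, buckets)
```

The candidate set here is quadratic in the input size, which is acceptable for a small illustration but not for large data sets; it is a simplification chosen for clarity, not a claim about how the paper performs the reduction.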
Journal title:
- PVLDB
Volume 2, Issue
Pages -
Publication date: 2009